Optimal Sparse Regression Trees

نویسندگان

چکیده

Regression trees are one of the oldest forms AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within large literature on regression trees, there has been little effort towards full provable optimization, mainly due to computational hardness problem. This work proposes dynamic programming-with-bounds approach construction provably-optimal sparse trees. We leverage novel lower bound based an optimal solution k-Means clustering algorithm dimensional data. often able find in seconds, even challenging datasets that involve numbers samples highly-correlated features.

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Optimal Partitioning for Classification and Regression Trees

In designing a decision tree for classification or regression, one selects at each node a feature to be tested, and partitions the range of that feature into a small number of bins, each bin corresponding to a child of the node. When the feature’s range is discrete with N unordered outcomes, the optimal partition, that is, the partition minimizing an expected loss, is usually found by an exhaus...

متن کامل

Multivariate Dyadic Regression Trees for Sparse Learning Problems

We propose a new nonparametric learning method based on multivariate dyadic regression trees (MDRTs). Unlike traditional dyadic decision trees (DDTs) or classification and regression trees (CARTs), MDRTs are constructed using penalized empirical risk minimization with a novel sparsity-inducing penalty. Theoretically, we show that MDRTs can simultaneously adapt to the unknown sparsity and smooth...

متن کامل

Nearly Optimal Minimax Estimator for High Dimensional Sparse Linear Regression

We present estimators for a well studied statistical estimation problem: the estimation for the linear regression model with soft sparsity constraints (`q constraint with 0 < q ≤ 1) in the high-dimensional setting. We first present a family of estimators, called the projected nearest neighbor estimator and show, by using results from Convex Geometry, that such estimator is within a logarithmic ...

متن کامل

Globally Optimal Fuzzy Decision Trees for Classification and Regression

ÐA fuzzy decision tree is constructed by allowing the possibility of partial membership of a point in the nodes that make up the tree structure. This extension of its expressive capabilities transforms the decision tree into a powerful functional approximant that incorporates features of connectionist methods, while remaining easily interpretable. Fuzzification is achieved by superimposing a fu...

متن کامل

Outlier Detection by Boosting Regression Trees

A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev&rsquo;s inequality applied to the maximum over the boosting iterations of ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Proceedings of the ... AAAI Conference on Artificial Intelligence

سال: 2023

ISSN: ['2159-5399', '2374-3468']

DOI: https://doi.org/10.1609/aaai.v37i9.26334